Graph-Based Modeling of ETL Activities with Multi-level Transformations and Updates
Authors
Abstract
Extract-Transform-Load (ETL) workflows are data-centric workflows responsible for transferring, cleaning, and loading data from their respective sources to the warehouse. In this paper, we build upon existing graph-based modeling techniques that treat ETL workflows as graphs by (a) extending the activity semantics to incorporate negation, aggregation and self-joins, (b) complementing querying semantics with insertions, deletions and updates, and (c) transforming the graph to allow zoom-in/out at multiple levels of abstraction (i.e., passing from the detailed description of the graph at the attribute level to more compact variants involving programs, relations and queries, and vice versa).
Similar resources
Modeling ETL activities as graphs
Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. In this paper, we focus on the logical design of the ETL scenario of a data warehouse. Based on a formal logical model that includes the data stores, activities and their constituent parts, we model ...
Order-Aware ETL Workflows
Tziovara, Vasiliki, A. MSc, Computer Science Department, University of Ioannina, Greece. October, 2006. Order-Aware ETL Workflows. Thesis Supervisor: Panos Vassiliadis. Data Warehouses are collections of data coming from different sources, used mostly to support decision making and data analysis in an organization. To populate a data warehouse with up-to-date records that are extracted from the...
Systematic ETL management - Experiences with high-level operators
Large organizations load much of their data into data warehouses for subsequent querying, analysis, and data mining. Extract-Transform-Load (ETL) workflows populate those data warehouses with data from various data sources by specifying and executing a set of transformations forming a directed acyclic transformation graph (DAG). Over time, hundreds of individual ETL workflows evolve as new sour...
A Framework for ETL Systems Development
There are many commercial Extract-Transform-Load (ETL) tools, most of which do not offer an integrated platform for modeling processes and extending functionality. This drawback complicates customization and integration with other applications, and consequently, many companies adopt internal development of their ETL systems. A possible solution is to create a framework to provide ex...
Formal Semantics of Consistent EMF Model Transformations by Algebraic Graph Transformation
Model transformation is one of the key activities in model-driven software development. An increasingly popular technology to define modeling languages is provided by the Eclipse Modeling Framework (EMF). Several EMF model transformation approaches have been developed, focusing on different transformation aspects. To validate model transformations wrt. functional behavior and correctness, a for...